Embodiments of the present invention relate generally to a method and apparatus for processing audio signals, and more particularly, to a method and apparatus for processing audio signals to improve the listening experience for a broad range of end users.
End users with “high-end” or expensive equipment, including multi-channel amplifiers and multi-speaker systems, currently have a limited capability to adjust the volume of the center channel signal of a multi-channel audio system independently of the audio signals on the remaining channels. Since many movies have mostly dialog on the center channel and other sound effects located on other channels, this limited adjustment capability allows the end-user to raise the amplitude of the mostly-dialog channel so that it is more intelligible during sections with loud sound effects. Currently, this limited adjustment has important shortcomings. First, it is an adjustment capability that is only available to end users that have a DVD player and a multi-channel speaker system, such as a six-speaker home theater system, that permits volume level adjustment of all speakers independently. Second, it is an adjustment that will need to be continuously modified during transients in a preferred audio signal (e.g., voice or dialog signal) and remaining audio signal (all other channels). Finally, voice to remaining audio (VRA) adjustments that were acceptable during one audio segment of the movie program may not be good for another audio segment if the remaining audio level increases too much or the dialog level decreases too much.
A large majority of end-users do not have, and for many years will not have, a home theater that permits this adjustment capability, i.e., a Dolby Digital decoder, a six-channel variable gain amplifier and a multi-speaker system. In addition, end-users do not have the ability to ensure that the VRA ratio selected at the beginning of the program will stay the same for the entire program.
FIG. 3 illustrates the intended spatial positioning setup of a common home theater system. Although there are no written rules for audio production in 5.1 spatial channels, there are industry standards. As used herein, the term “spatial channels” refers to the physical location of an output device (e.g., speakers) and how the sound from the output device is delivered to the end user. One of these standards is to locate the majority of dialog on the center channel 226. Likewise, other sound effects that require spatial positioning will be placed on any of the other four speakers labeled L 221, R 222, Ls 223, and Rs 224 for left, right, left surround and right surround. In addition, to avoid damage to midrange speakers, low frequency effects (LFE) are placed on the 0.1 channel directed toward a subwoofer speaker 225.
Digital audio compression allows the producer to provide the end-user with a greater dynamic range for the audio than was possible through analog transmission. This greater dynamic range causes most dialog to sound too low in the presence of some very loud sound effects. The following example provides an explanation. Suppose an analog transmission (or recording) has the capability to transmit dynamic range amplitudes up to 95 dB and dialog is typically recorded at 80 dB. Loud segments of remaining audio may obscure the dialog when that remaining audio reaches the upper limit while someone is speaking. However, this situation is exacerbated when digital audio compression allows a dynamic range up to 105 dB. Clearly, the dialog will remain at the same level (80 dB) with respect to other sounds, only now the loud remaining audio can be more realistically reproduced in terms of its amplitude. User complaints that dialog levels have been recorded too low on DVDs are very common. In fact, the dialog IS at the proper level and is more appropriate and realistic than what exists for analog recordings with limited dynamic range.
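The arithmetic behind this example can be made explicit. The following is a minimal sketch that uses only the figures quoted above (dialog at 80 dB, upper limits of 95 dB and 105 dB); the function names `dialog_deficit_db` and `db_to_amplitude_ratio` are hypothetical and are introduced purely for illustration.

```python
def dialog_deficit_db(peak_db: float, dialog_db: float) -> float:
    """Return how many dB the loudest remaining audio can exceed the dialog."""
    return peak_db - dialog_db


def db_to_amplitude_ratio(level_db: float) -> float:
    """Convert a level difference in dB to a linear amplitude ratio."""
    return 10 ** (level_db / 20)


DIALOG_DB = 80.0         # dialog level from the example
ANALOG_PEAK_DB = 95.0    # upper limit of the analog example
DIGITAL_PEAK_DB = 105.0  # upper limit of the digital example

analog_deficit = dialog_deficit_db(ANALOG_PEAK_DB, DIALOG_DB)    # 15 dB
digital_deficit = dialog_deficit_db(DIGITAL_PEAK_DB, DIALOG_DB)  # 25 dB

# The extra 10 dB of range lets the remaining audio reach roughly
# 3.16 times the amplitude it could reach in the analog case.
extra_amplitude = db_to_amplitude_ratio(digital_deficit - analog_deficit)
```

In this sketch the dialog level is unchanged between the two cases; only the margin by which the remaining audio can exceed it grows, from 15 dB to 25 dB.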
Even for consumers who currently have properly calibrated home theater systems, dialog is frequently masked by the loud remaining audio sections in many DVD movies produced today. A small group of consumers are able to find some improvement in intelligibility by increasing the volume of the center channel and/or decreasing the volume of all of the other channels. However, this fixed adjustment is only acceptable for certain audio passages and it disrupts the levels from the proper calibration. The speaker levels are typically calibrated to produce certain sound pressure levels (SPLs) in the viewing location. This proper calibration ensures that the viewing is as realistic as possible. Unfortunately, this means that loud sounds are reproduced very loud. During late night viewing, this may not be desirable. However, any adjustment of the speaker levels will disrupt the calibration.
A method for decoding an audio signal includes receiving a digital audio signal having a plurality of channels defined thereon, wherein one of the plurality of channels is a center channel and at least one of the other of said plurality of channels is a remaining audio channel; comparing the center channel with the at least one of the other of the plurality of channels to determine a ratio of the center channel to the other of the plurality of channels; and automatically adjusting the center channel and the at least one of the plurality of other channels when a predetermined value for the ratio is not met.
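The compare-and-adjust steps of this method might be sketched as follows. This is an illustrative sketch only, not the claimed implementation: it assumes that the ratio is measured as RMS level per block of samples, that "not met" means the measured ratio is below the predetermined value, and that the correction is split equally between boosting the center channel and attenuating the remaining channels. The names `rms` and `adjust_vra` are hypothetical.

```python
import math
from typing import List, Optional, Sequence, Tuple


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of one block of samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def adjust_vra(
    center: Sequence[float],
    remaining: Sequence[Sequence[float]],
    target_ratio: float,
) -> Tuple[List[float], List[List[float]], Optional[float]]:
    """Compare the center channel with the remaining channels and adjust both
    when the measured ratio falls below target_ratio (assumed criterion).

    Returns (center_out, remaining_out, measured_ratio). measured_ratio is
    None when either side is silent and no ratio can be formed.
    """
    center_level = rms(center)
    # Assumption: combine remaining channels by summing their powers.
    remaining_level = math.sqrt(sum(rms(ch) ** 2 for ch in remaining))

    center_out = list(center)
    remaining_out = [list(ch) for ch in remaining]
    if center_level == 0.0 or remaining_level == 0.0:
        return center_out, remaining_out, None

    ratio = center_level / remaining_level
    if ratio >= target_ratio:
        return center_out, remaining_out, ratio  # predetermined value is met

    # Assumption: split the correction evenly between the two sides.
    gain = math.sqrt(target_ratio / ratio)
    center_out = [s * gain for s in center]
    remaining_out = [[s / gain for s in ch] for ch in remaining]
    return center_out, remaining_out, ratio
```

For example, with a center block at one quarter of the level of a single remaining channel and a target ratio of 1.0, the sketch doubles the center block and halves the remaining block, so the two end up at equal level. A practical system would also smooth the gain over time to avoid audible jumps between blocks; that is omitted here.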